REMARKS 



The application is believed to be in condition for allowance because the 
claims are novel and non-obvious over the cited art. The following paragraphs 
provide the justification for these beliefs. In view of the following reasoning for 
allowance, the applicant hereby respectfully requests further examination and 
reconsideration of the subject application. 

Claims 1-32 are pending in this application. 

Examiner Interview 

An Examiner Interview was conducted between Examiner David Rashid and 
the applicants' representative, Katrina Lyon, on February 25, 2009. A proposed 
Amendment was discussed. In particular, the real-time claim limitation was 
discussed. The Examiner stated that the claim amendments appeared to overcome 
the cited art, but he would have to study the amendments and art further. 
Compliance with the new interpretation of 35 USC 101 was also discussed. 

Response to Arguments 

The Examiner states that "a predefined set of classes" is the same as "a 
number of classes" and that "in near real-time" is highly subjective because there is no 
value assigned. However, one with ordinary skill in the art would recognize that 
a "predefined set of classes" is not the same as a preferred number of 
classes, as the applicants claim. In the applicants' claimed invention it is not 
necessary to define what type of class is sought; all that is needed is the 
preferred number of classes sought (e.g., "3"), which requires much less 
information to specify than a class itself (e.g. car, person, flower). Although 
one may be able to compute the number of classes sought from the type of 
classes, having to input the type of classes requires more information and is 
more computationally expensive than just having to provide the number of 
classes. Hence the applicants' claimed invention has advantages not taught 
by Foote or the other cited art. Furthermore, Claim 1 includes the limitation of 
automatically decomposing the image sequence into the preferred number of 
classes of objects in near real-time or processing the provided image sequence and 
computing the single set of model parameters at a same rate that the image 
sequence is provided. As stated by the Examiner, Foote does not teach 
automatically decomposing the image sequence into the preferred number of 
classes of objects in near real-time because Foote segments a full video into 
individual presentations based on the extent of each presenter's speech. 
(Abstract) Hence, Foote can only segment a video file with corresponding audio 
after it has been recorded, not in real-time as it is being input. The applicants do 
not believe that the term "in real-time" in the applicants' claims is subjective, 
because the applicants' specification clearly states that this means processing data 
and learning generative models at substantially the same rate the input data is 
received (see Summary). The claims must be read in light of the specification; 
therefore the term "real-time" is not highly subjective. However, to speed 
prosecution the applicants have amended the independent claims to include 
the language from the specification. 

The 35 USC 112 Rejection of Claims 1-32. 

Claims 1-32 stand rejected under 35 USC 112, second paragraph, as being 
indefinite for failing to particularly point out and distinctly claim the subject matter 
which applicant regards as the invention. This was because the term "the mean 
visual appearance and variance of each class" lacks antecedent basis. In response 
the applicants have amended Claims 1 and 23 to change "the mean visual 
appearance" to "a mean visual appearance". It is believed that this amendment 
renders the claims definite. Reconsideration of Claims 1-32 is respectfully 
requested. 

The Section 101 Rejection of Claims 1-32 

Claims 1-32 were rejected under 35 USC 101. The Office Action stated that, where claims 
recite a series of steps or acts to be performed, a statutory "process" under 35 USC 101 must (1) be 
tied to another statutory category (such as a particular apparatus), or (2) transform 
underlying subject matter (such as an article or material) to a different state or thing. It 
was alleged that the rejected claims neither transform the underlying subject matter nor 
positively tie to another statutory category that accomplishes the claimed method 
steps, and therefore do not qualify as a statutory process. 

The applicants have amended independent claims 1 and 23 to state that a 
computer is performing the steps. Therefore, the claims are tied to a statutory process 
under 35 USC 101. 

In view of the amended claims, it is believed that claims 1-32 are patentable 
under 35 USC 101. Therefore, it is respectfully requested that the rejection of these 
claims be reconsidered. 

The Rejection of Claims 1-3, 5-6, 14, 18-19 and 23-24 Under 35 USC 102(b). 

Claims 1-3, 5-6, 14, 18-19 and 23-24 stand rejected under 35 USC 102(b) as 
being anticipated by Foote et al. U.S. Patent No. 6,404,925 (hereinafter Foote). It was 
contended in the above-identified Office Action that Foote teaches all the elements of 
the rejected claims. The applicants respectfully disagree with this contention of 
anticipation. 

The applicants claim a technique that can extract objects from an image 
sequence using the constraints on their motion and also performs tracking while the 
appearance models are learned. The technique operates in near real time, 
processing data and learning generative models at substantially the same rate 
the input data is received. (Summary) 

The claimed technique tries to recognize patterns in time (e.g., finding 
possibly recurring scenes or objects in an image sequence), and in order to do 
so attempts to model the process that could have generated the pattern. It 
uses the possible states or classes, the probability of each of the classes being in 
each of the states at a given time and a state transition matrix that gives the 
probability of a given state given that state at a previous time. The states further 
may include observable states and hidden states. In such cases the observed 
sequence of states is probabilistically related to the hidden process. The processes 
are modeled using a transformed Hidden Markov model (THMM) where there is an 
underlying hidden Markov process changing over time, and a set of observable 
states which are related somehow to the hidden states. The connections between 
the hidden states and the observable states represent the probability of generating a 
particular observed state given that the Markov process is in a particular hidden 
state. All probabilities entering an observable state will sum to 1. (Summary) 



The number of classes of objects and an image sequence are all that must be 
provided in order to extract objects from an image sequence and learn their 
generative model (e.g., a model of how the observed data could have been 
generated). Given this information, probabilistic inference and learning are 
used to compute a single set of model parameters that represent either the 
video sequence processed to that point or the entire video sequence. These 
model parameters include the mean appearance and variance of each class. 
The probability of each class is also determined. (Summary) 



More specifically, the applicants claim, 

"A system for automatically decomposing an image sequence, comprising a 
computer-readable storage medium storing a program that when executed 
causes: 

a computer to perform the following process actions, 

providing an image sequence of at least one image frame of a scene; 

providing only a preferred number of classes of objects to be identified 
within the image sequence; 

automatically decomposing the image sequence into the preferred number of 
classes of objects, using probabilistic inference and learning to compute a single 
set of model parameters comprising a mean visual appearance and variance of 
each class in the image sequence, processing the provided image sequence 
and computing the single set of model parameters at a same rate that the 
image sequence is provided." 
And, 

"A computer-implemented process for automatically generating a 
representation of an object in at least one image sequence, comprising a 
computer-readable storage medium storing a program: 

that when executed causes a computer to, 

acquire at least one image sequence, each image sequence having at 
least one image frame; 

automatically decompose each image sequence into a generative 
model with each generative model comprising a set of model parameters 
comprising a mean visual appearance and variance of each class in the 
image sequence being decomposed, using an expectation-maximization 
analysis that employs a Viterbi analysis, wherein each generative model is 
computed at a same rate that the at least one image sequence is 
acquired." 

Foote discloses methods for segmenting audio-video recording of meetings 
containing slide presentations by one or more speakers. These segments serve as 
indexes into the recorded meeting. If an agenda is provided for the meeting, these 
segments can be labeled using information from the agenda. The system 
automatically detects intervals of video that correspond to presentation slides. 
Under the assumption that only one person is speaking during an interval 
when slides are displayed in the video, possible speaker intervals are 
extracted from the audio soundtrack by finding these regions. Since the same 
speaker may talk across multiple slide intervals, the acoustic data from these 
intervals is clustered to yield an estimate of the number of distinct speakers 
and their order. Clustering the audio data from these intervals yields an 
estimate of the number of different speakers and their order. Merged clustered 
audio intervals corresponding to a single speaker are then used as training 
data for a speaker segmentation system. Using speaker identification techniques, 
the full video is then segmented into individual presentations based on the 
extent of each presenter's speech. (Abstract) 

Foote does not teach the applicants' claimed preferred number of 
classes of objects to be identified within the image sequence or automatically 
decomposing the image sequence into the preferred number of classes of 
objects, processing data and learning generative models at substantially the 
same rate the input data is received. 

Granted, as to Claim 1, the Office Action states that providing an image 
sequence of at least one image frame is taught in FIG. 2, element 201 and FIG. 3, 
elements 301-308. But FIG. 3 refers to training images for training the Foote system 
shown in FIG. 2, not an image frame of element 201. Additionally, the Office Action 
states that providing a preferred number of classes of objects to be identified 
within the image sequence is taught as a "predefined set of classes" in Col. 5, 
lines 14-16. But a "predefined set of classes" is not the same as a preferred 
number of classes, as the applicants claim. In the applicants' claimed invention 
it is not necessary to define what type of class is sought; all that is needed is the 
preferred number of classes sought, which requires much less information to specify 
than a class itself. Furthermore, Claim 1 includes the limitation of automatically 
decomposing the image sequence into the preferred number of classes of objects, 
processing data and learning generative models at substantially the same rate 
the input data is received. Cited Column 5, lines 14-16, does not teach 
automatically decomposing the image sequence into the preferred number of 
classes of objects in near real-time by processing data and learning generative 
models at substantially the same rate the input data is received. Nothing at all 
is stated in this paragraph regarding processing in near real-time. In fact, as stated 
by the Examiner, Foote does not teach automatically decomposing the image 
sequence into the preferred number of classes of objects in near real-time 
(processing data and learning generative models at substantially the same 
rate the input data is received) because Foote segments a full video into 
individual presentations based on the extent of each presenter's speech. 
(Abstract) Hence, Foote can only segment a video file with corresponding audio after 
it has been recorded, not as the data is being acquired or input. 

As for Claim 23, the Office Action states that providing an image sequence of 
at least one image frame is taught in FIG. 2, element 201 and FIG. 3, elements 
301-308. But FIG. 3 refers to training images for training the Foote system shown in FIG. 
2, not an image frame of element 201. Furthermore, the Office Action states that 
automatically decomposing each image sequence into a generative model is taught 
in FIG. 2, elements 202-205; Col. 5, line 65 - Col. 6, line 2, but this passage does not 
teach automatically decomposing each image sequence into a generative model. It 
merely appears to determine video features in image frames and to use these 
features to determine which of the predefined classes a frame belongs to. It does 
not teach automatically decomposing each image sequence into a generative model 
(e.g., a model of how the observed data could have been generated) with each 
generative model including a set of model parameters that represent at least one 
object class for each image sequence using an expectation-maximization analysis 
that employs a Viterbi analysis, wherein each generative model is computed at a 
same rate that the at least one image sequence is acquired. Nothing in Foote 
teaches the decomposing of an image sequence by processing data and learning 
generative models at substantially the same rate the input data is received. 

Thus, the applicants have claimed an element not taught in Foote, namely 
inputting a number of classes of objects to be identified within the image sequence 
or automatically decomposing the image sequence into the preferred number of 
classes of objects by processing data and learning generative models at 
substantially the same rate the input data is received. Nor does Foote teach 
model parameters comprising the mean visual appearance and variance of each 
class in the image sequence. As such, the rejected claims, as amended, are not 
anticipated by the reference. It is, therefore, respectfully requested that the rejection 
of Claims 1-3, 5-6, 14, 18-19 and 23-24 be reconsidered based on the above-quoted 
distinguishing claim language. 

The 35 USC 103(a) Rejection of Claims 4, 7 and 27. 

Claims 4, 7 and 27 were rejected under 35 USC 103(a) as unpatentable over 
Foote, in view of Petrovic et al. (Transformed Hidden Markov Models: Estimating Mixture 
Models of Images and Inferring Spatial Transformations in Video Sequences, Computer 
Vision and Pattern Recognition, 2000, Vol. 2, pg 16-33), hereinafter Petrovic. The 
Office Action contended that Foote teaches all of the limitations of Claims 4, 7 and 27, 
except that Foote does not teach a model that employs a latent image and a translation 
variable in learning each object class, nor does Foote teach using a latent image and a 
translation variable in filling in hidden variables. However, the Office Action contended 
that Petrovic teaches these features, rendering Claims 4, 7 and 27 obvious. The 
applicants respectfully disagree with this contention of obviousness. 

In order to deem the applicants' claimed invention unpatentable under 35 USC 
103, a prima facie showing of obviousness must be made. To make a prima facie 
showing of obviousness, all of the claimed elements of an applicant's invention must be 
considered, especially when they are missing from the prior art. If a claimed element is 
not taught in the prior art and has advantages not appreciated by the prior art, then no 
prima facie case of obviousness exists. The Federal Circuit court has stated that it was 
error not to distinguish claims over a combination of prior art references where a 
material limitation in the claimed system and its purpose was not taught therein (In Re 
Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988)). 



As discussed above, the applicants claim, 

"A system for automatically decomposing an image sequence, comprising 
a computer-readable storage medium storing a program that when executed 
causes: 

a computer to perform the following process actions, 

providing an image sequence of at least one image frame of a scene; 

providing only a preferred number of classes of objects to be 
identified within the image sequence; 

automatically decomposing the image sequence into the preferred number 
of classes of objects, using probabilistic inference and learning to compute a 
single set of model parameters comprising a mean visual appearance and 
variance of each class in the image sequence, processing the provided 
image sequence and computing the single set of model parameters at a 
same rate that the image sequence is provided." 

And, 

"A computer-implemented process for automatically generating a 
representation of an object in at least one image sequence, comprising a 
computer-readable storage medium storing a program: 
that when executed causes a computer to, 

acquire at least one image sequence, each image sequence having 
at least one image frame; 

automatically decompose each image sequence into a generative 
model with each generative model comprising a set of model parameters 
comprising a mean visual appearance and variance of each class in the 
image sequence being decomposed, using an expectation-maximization 
analysis that employs a Viterbi analysis, wherein each generative model 
is computed at a same rate that the at least one image sequence is 
acquired." 

As discussed above Foote does not teach the applicants' claimed 
preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the 
preferred number of classes of objects, processing data and learning 
generative models at substantially the same rate the input data is received, 
e.g., in near real-time. Petrovic also does not teach these features. 



Accordingly, Foote in combination with Petrovic does not teach the applicants' 
claimed preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the preferred 
number of classes of objects in near real-time, processing data and learning 
generative models at substantially the same rate the input data is received. 
Nor does Foote in combination with Petrovic recognize the advantages of the 
applicants' claimed invention. Namely, Foote in combination with Petrovic does not 
teach allowing video sequences to be decomposed into a preferred number of 
classes in real-time with a minimal amount of input data. Thus, the applicants have 
claimed elements not taught in the cited art and which have advantages not 
recognized therein. Accordingly, no prima facie case of obviousness has been 
established in accordance with the holding of In Re Fine. This lack of prima facie 
showing of obviousness means that the rejected claims are patentable under 35 
USC 103 over Foote in view of Petrovic. As such, it is respectfully requested that 
Claims 4, 7 and 27 be allowed based on the previously-quoted claim language. 

The 35 USC 103(a) Rejection of Claims 8-10, 13, 15-17 and 28-31. 

Claims 8-10, 13, 15-17 and 28-31 were rejected under 35 USC 103(a) as 
unpatentable over Foote in view of Dellaert (The Expectation Maximization Algorithm, 
College of Computing, Georgia Institute of Technology, Technical Report Number 
GIT-GVU-02-20, 2/2002), hereinafter referred to as Dellaert. The Office Action contended 
that Foote teaches all of the limitations of Claims 8-10, 13, 15-17 and 28-31, except that 
Foote does not directly teach various computations in the expectation step of the 
generalized expectation-maximization parameters. However, the Office Action 
contended that Dellaert teaches these features, rendering Claims 8-10, 13, 15-17 and 
28-31 obvious. The applicants respectfully disagree with this contention of 
obviousness. 

As discussed above, the applicants claim, 

"A system for automatically decomposing an image sequence, comprising a 
computer-readable storage medium storing a program that when executed 
causes: 

a computer to perform the following process actions, 

providing an image sequence of at least one image frame of a scene; 

providing only a preferred number of classes of objects to be identified 
within the image sequence; 

automatically decomposing the image sequence into the preferred number of 
classes of objects, using probabilistic inference and learning to compute a single 
set of model parameters comprising a mean visual appearance and variance of 
each class in the image sequence, processing the provided image sequence 
and computing the single set of model parameters at a same rate that the 
image sequence is provided." 

And, 

"A computer-implemented process for automatically generating a 
representation of an object in at least one image sequence, comprising a 
computer-readable storage medium storing a program: 

that when executed causes a computer to, 

acquire at least one image sequence, each image sequence having at least 
one image frame; 

automatically decompose each image sequence into a generative model 
with each generative model comprising a set of model parameters comprising a 
mean visual appearance and variance of each class in the image sequence 
being decomposed, using an expectation-maximization analysis that employs a 
Viterbi analysis, wherein each generative model is computed at a same rate 
that the at least one image sequence is acquired." 

As discussed above Foote does not teach the applicants' claimed 
preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the 
preferred number of classes of objects, processing data and learning 
generative models at substantially the same rate the input data is received. 
Dellaert also does not teach these features. 



Accordingly, Foote in combination with Dellaert does not teach the applicants' 
claimed preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the preferred 
number of classes of objects, processing data and learning generative models at 
substantially the same rate the input data is received. Nor does Foote teach 
automatically decomposing each image sequence into a generative model including 
a set of model parameters including the mean visual appearance and variance of 
each class in the image sequence using an expectation-maximization analysis 
that employs a Viterbi analysis wherein each generative model is computed at a 
same rate that the at least one image sequence is acquired. Nor does Foote in 
combination with Dellaert recognize the advantages of the applicants' claimed 
invention. Namely, Foote in combination with Dellaert does not teach allowing video 
sequences to be decomposed into a preferred number of classes in real-time. Thus, 
the applicants have claimed elements not taught in the cited art and which have 
advantages not recognized therein. Accordingly, no prima facie case of 
obviousness has been established in accordance with the holding of In Re Fine. 
This lack of prima facie showing of obviousness means that the rejected claims are 
patentable under 35 USC 103 over Foote in view of Dellaert. As such, it is 
respectfully requested that Claims 8-10, 13, 15-17 and 28-31 be allowed based on 
the previously-quoted claim language. 



The 35 USC 103(a) Rejection of Claims 11-12. 

Claims 11-12 were rejected under 35 USC 103(a) as unpatentable over Foote, in 
view of Dellaert, in further view of Eberman et al., U.S. Patent No. 5,925,065, 
hereinafter Eberman. The Office Action contended that Foote and Dellaert teach all of the 
limitations of Claims 11-12, except that Foote and Dellaert do not directly teach 
accelerating the expectation step using a FFT-based inference analysis. However, the 
Office Action contended that Eberman teaches this feature, rendering Claims 11-12 
obvious. The applicants respectfully disagree with this contention of obviousness. 



As discussed above, the applicants claim, 

"A system for automatically decomposing an image sequence, comprising a 
computer-readable storage medium storing a program that when executed 
causes: 

a computer to perform the following process actions, 

providing an image sequence of at least one image frame of a scene; 

providing only a preferred number of classes of objects to be identified 
within the image sequence; 

automatically decomposing the image sequence into the preferred number of 
classes of objects, using probabilistic inference and learning to compute a single 
set of model parameters comprising a mean visual appearance and variance of 
each class in the image sequence, processing the provided image sequence 
and computing the single set of model parameters at the same rate that the 
image sequence is provided." 

As discussed above Foote does not teach the applicants' claimed 
preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the 
preferred number of classes of objects in near real-time, processing the 
provided image sequence and computing the single set of model parameters 
at the same rate that the image sequence is provided. Dellaert and Eberman 
also do not teach these features. 

Accordingly, Foote in combination with Dellaert and Eberman does not teach 
the applicants' claimed preferred number of classes of objects to be identified within 
the image sequence or automatically decomposing the image sequence into the 
preferred number of classes of objects in near real-time by processing the 
provided image sequence and computing the single set of model parameters 
at the same rate that the image sequence is provided. Nor does Foote in 
combination with Dellaert and Eberman recognize the advantages of the applicants' 
claimed invention. Namely, Foote in combination with Dellaert and Eberman does 
not teach allowing video sequences to be decomposed into a preferred number of 
classes in real-time. Thus, the applicants have claimed elements not taught in the 
cited art and which have advantages not recognized therein. Accordingly, no prima 
facie case of obviousness has been established in accordance with the holding of In 
Re Fine. This lack of prima facie showing of obviousness means that the rejected 
claims are patentable under 35 USC 103 over Foote in view of Dellaert and Eberman. As such, it is 
respectfully requested that Claims 11-12 be allowed based on the previously-quoted 
claim language. 

The 35 USC 103(a) Rejection of Claims 20-21 and 25-26. 

Claims 20-21 and 25-26 were rejected under 35 USC 103(a) as unpatentable 
over Foote, in view of Jojic et al. (Learning Flexible Sprites in Video Layers, Proc. of 
IEEE Conf. on Computer Vision and Pattern Recognition, 2001, pg. 1-8), hereinafter Jojic. The Office 
Action contended that Foote teaches all of the limitations of these claims, except that Foote 
does not teach various model parameters of the applicants' claimed invention. However, the 
Office Action contended that Jojic teaches these features, rendering Claims 20-21 and 
25-26 obvious. The applicants respectfully disagree with this contention of 
obviousness. 

As discussed above, the applicants claim, 

"A computer-implemented process for automatically generating a 
representation of an object in at least one image sequence, comprising a 
computer-readable storage medium storing a program: 

that when executed causes a computer to, 

acquire at least one image sequence, each image sequence having 
at least one image frame; 

automatically decompose each image sequence into a generative 
model with each generative model comprising a set of model parameters 
comprising a mean visual appearance and variance of each class in the 
image sequence being decomposed, using an expectation-maximization 
analysis that employs a Viterbi analysis, wherein each generative model 
is computed at the same rate that the at least one image sequence is 
acquired." 

As discussed above Foote does not teach the applicants' claimed 
preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the 
preferred number of classes of objects in near real-time, processing the 
provided image sequence and computing the single set of model parameters 
at the same rate that the image sequence is provided. Jojic also does not 
teach these features. 

Accordingly, Foote in combination with Jojic does not teach the applicants' 
claimed preferred number of classes of objects to be identified within the image 
sequence or automatically decomposing the image sequence into the preferred 
number of classes of objects in near real-time, processing the provided image 
sequence and computing the single set of model parameters at the same rate 
that the image sequence is provided. Nor does Foote in combination with Jojic 
recognize the advantages of the applicants' claimed invention. Namely, Foote in 
combination with Jojic does not teach allowing video sequences to be decomposed 
into a preferred number of classes in real-time. Thus, the applicants have claimed 
elements not taught in the cited art and which have advantages not recognized 
therein. Accordingly, no prima facie case of obviousness has been established in 
accordance with the holding of In Re Fine. This lack of prima facie showing of 
obviousness means that the rejected claims are patentable under 35 USC 103 over 
Foote in view of Jojic. As such, it is respectfully requested that Claims 20-21 and 
25-26 be allowed based on the previously-quoted claim language. 

The 35 USC 103(a) Rejection of Claim 32. 

Claim 32 was rejected under 35 USC 103(a) as unpatentable over Foote, in view of 
Eberman. The Office Action contended that Foote and Eberman teach all of the 
limitations of Claim 11, which recites identical features to Claim 32, and Claim 32 is thus 
obvious for the reason previously described for Claim 11. The applicants 
respectfully disagree with this contention of obviousness. 



As discussed above, the applicants claim, 

"A computer-implemented process for automatically generating a 
representation of an object in at least one image sequence, comprising a 
computer-readable storage medium storing a program: 
that when executed causes a computer to, 

acquire at least one image sequence, each image sequence having 
at least one image frame; 

automatically decompose each image sequence into a generative 
model with each generative model comprising a set of model parameters 
comprising a mean visual appearance and variance of each class in the 
image sequence being decomposed, using an expectation-maximization 
analysis that employs a Viterbi analysis, wherein each generative model 
is computed at the same rate that the at least one image sequence is 
acquired." 

As discussed above Foote does not teach in near real-time automatically 
decomposing each image sequence into a generative model using an 
expectation-maximization analysis that employs a Viterbi analysis, wherein 
each generative model is computed at the same rate that the image sequence 
is acquired. Eberman also does not teach these features. 



Accordingly, Foote in combination with Eberman does not teach the 
applicants' claimed preferred number of classes of objects to be identified within the 
image sequence or automatically decomposing each image sequence into a 
generative model with each generative model comprising a set of model parameters 
comprising a mean visual appearance and variance of each class in the image 
sequence being decomposed, using an expectation-maximization analysis that 
employs a Viterbi analysis, wherein each generative model is computed at the 
same rate that the at least one image sequence is acquired. Nor does Foote in 
combination with Eberman recognize the advantages of the applicants' claimed 
invention. Namely, Foote in combination with Eberman does not teach allowing 
video sequences to be decomposed into a preferred number of classes in real-time. 
Thus, the applicants have claimed elements not taught in the cited art and which 
have advantages not recognized therein. Accordingly, no prima facie case of 
obviousness has been established in accordance with the holding of In Re Fine. 
This lack of prima facie showing of obviousness means that the rejected claims are 
patentable under 35 USC 103 over Foote in view of Eberman. As such, it is 
respectfully requested that Claim 32 be allowed based on the previously-quoted 
claim language. 

The applicants hereby respectfully request reconsideration of the subject 
application and allowance of Claims 1-32 at an early date. 

LYON & HARR, LLP Respectfully submitted, 